Disambiguation method of multi-feature fusion based on HowNet sememe and Word2vec word embedding representation
WANG Wei, ZHAO Erping, CUI Zhiyuan, SUN Hao
Journal of Computer Applications    2021, 41 (8): 2193-2198.   DOI: 10.11772/j.issn.1001-9081.2020101625
Existing word vectors represent low-frequency words poorly, the semantic information they carry is easily confused, and existing disambiguation models cannot distinguish polysemous words accurately. To address these problems, a multi-feature fusion disambiguation method based on word vector fusion was proposed. In the method, the word vectors expressed by HowNet sememes and the word vectors generated by Word2vec (Word to vector) were fused to supplement the polysemy information of words and improve the representation quality of low-frequency words. Firstly, the cosine similarity between the entity to be disambiguated and the candidate entity was calculated. After that, a clustering algorithm and the HowNet knowledge base were used to obtain the entity category feature similarity. Then, an improved Latent Dirichlet Allocation (LDA) topic model was used to extract topic keywords, from which the entity topic feature similarity was calculated. Finally, word sense disambiguation of polysemous words was realized by weighted fusion of the above three feature similarities. Experimental results on a test set from the Tibet animal husbandry field show that the accuracy of the proposed method (90.1%) is 7.6 percentage points higher than that of a typical graph model disambiguation method.
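The final weighted-fusion step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the candidate data layout, and the weights (0.5, 0.3, 0.2) are hypothetical, since the abstract does not state them. The category and topic similarities are taken as precomputed inputs, because the clustering and improved-LDA steps are not specified here.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def fused_score(sim_vector, sim_category, sim_topic, weights=(0.5, 0.3, 0.2)):
    """Weighted sum of the three feature similarities.

    The default weights are placeholders (an assumption), not values
    reported in the paper.
    """
    w_vec, w_cat, w_topic = weights
    return w_vec * sim_vector + w_cat * sim_category + w_topic * sim_topic


def disambiguate(mention_vec, candidates, weights=(0.5, 0.3, 0.2)):
    """Return (name, score) of the candidate with the highest fused score.

    `candidates` maps a candidate name to a tuple
    (candidate_vector, category_similarity, topic_similarity).
    This layout is hypothetical and chosen only for the example.
    """
    best_name, best_score = None, float("-inf")
    for name, (vec, sim_category, sim_topic) in candidates.items():
        sim_vector = cosine_similarity(mention_vec, vec)
        score = fused_score(sim_vector, sim_category, sim_topic, weights)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Toy usage with made-up vectors and similarities.
toy_candidates = {
    "sense_a": ([1.0, 0.0], 0.9, 0.8),
    "sense_b": ([0.0, 1.0], 0.1, 0.2),
}
best, score = disambiguate([1.0, 0.0], toy_candidates)
print(best, round(score, 3))
```

In practice the weights would be tuned on held-out data, and the vectors passed in would be the fused HowNet and Word2vec representations described above.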